 math problem


Start-ups are racing to revolutionise mathematics with AI

New Scientist

Mathematicians have never been so sought after by the world's richest people. At universities across the world, academics are seeing their colleagues mysteriously disappear and join private companies. Some of these companies are household names, like OpenAI and Google, but others are newly formed and just months old, hoping to capitalise on a moment in which mathematics is seen as the secret ingredient with which to improve artificial intelligence - which may in turn transform mathematics itself. "Last May, I was honestly kind of grieving for my scientific identity," says Ken Ono, who in 2025 went on leave from a professorship at the University of Virginia to join Axiom Math, a start-up aiming to build a maths-focused AI. Ono had been asked by a different company, called Epoch AI, to help craft a set of hard-to-solve maths problems that would test AI's problem-solving ability.


Can you best a math Olympiad? Test your skills with the world's largest database of problems.

Popular Science

MathNet contains 30,000 free math problems collected over half a century. In 1959, countries around the world sent their most talented students to Romania to compete in the first-ever International Mathematical Olympiad (IMO).


The Strange Origin of AI's 'Reasoning' Abilities

The Atlantic - Technology

It involves 4chan, of all places. In July 2020, 4chan's video-game discussion board looked much like the rest of the notorious online forum. There were elaborate, libidinal fantasies involving "whores" and "dragon cum," and comments on how long a gamer had to wait "before my dick can get up for another beating," as one put it. And yet, as the gamers discussed such things, they were also making a discovery of significance to the AI industry. Some of them were playing, a new text-based role-playing game that was essentially an AI version of .


Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Neural Information Processing Systems

Current AI alignment methodologies rely on human-provided demonstrations or judgments, and the learned capabilities of AI systems would be upper-bounded by human capabilities as a result. This raises a challenging research question: How can we keep improving the systems when their capabilities have surpassed the levels of humans?


JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

Neural Information Processing Systems

Mathematical reasoning is an important capability of large language models (LLMs) for real-world applications. To enhance this capability, existing work either collects large-scale math-related texts for pre-training or relies on stronger LLMs (e.g., GPT-4) to synthesize massive numbers of math problems. Both types of work generally lead to large costs in training or synthesis. To reduce the cost, based on open-source available texts, we propose an efficient way that trains a small LLM for math problem synthesis, to efficiently generate sufficient high-quality pre-training data. To achieve it, we create a dataset using GPT-4 to distill its data synthesis capability into the small LLM. Concretely, we craft a set of prompts based on human education stages to guide GPT-4 to synthesize problems covering diverse math knowledge and difficulty levels. Besides, we adopt the gradient-based influence estimation method to select the most valuable math-related texts. Both are fed into GPT-4 to create the knowledge distillation dataset used to train the small LLM. We leverage it to synthesize 6 million math problems for pre-training our JiuZhang3.0




A New AI Math Startup Just Cracked 4 Previously Unsolved Problems

WIRED

Axiom says its AI found solutions to several long-standing math problems, a sign of the technology's steadily advancing reasoning capabilities. Five years ago, mathematicians Dawei Chen and Quentin Gendron were trying to untangle a difficult area of algebraic geometry involving differentials, elements of calculus used to measure distance along curved surfaces. While working on one theorem, they ran into an unexpected roadblock: Their argument depended on a strange formula from number theory, but they were unable to solve or justify it. In the end, Chen and Gendron wrote a paper presenting their idea as a conjecture, rather than a theorem. Chen recently spent hours prompting ChatGPT in the hopes of getting the AI to come up with a solution to the still unsolved problem, but it wasn't working.



Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly used for educational support, yet their response quality varies depending on the language of interaction. This paper presents an automated multilingual pipeline for generating, solving, and evaluating math problems aligned with the German K-10 curriculum. We generated 628 math exercises and translated them into English, German, and Arabic. Three commercial LLMs (GPT-4o-mini, Gemini 2.5 Flash, and Qwen-plus) were prompted to produce step-by-step solutions in each language. A held-out panel of LLM judges, including Claude 3.5 Haiku, evaluated solution quality using a comparative framework. Results show a gap, with English solutions consistently rated highest and Arabic solutions often ranked lower. These findings highlight persistent linguistic bias and the need for more equitable multilingual AI systems in education.